A Study of Phone Recognizer Combination for Higher Accuracy in Timit Phone Recognition

Authors

Supphanat KANOKPHARA

Julie CARSON-BERNDSEN

Abstract

Generally, a phone recognition system contains only a single phone recognizer, whose phone set and speech representation are optimized for a particular task. This paper studies the effect of phone sets and speech representations on the TIMIT phone recognition task. Two phone sets (the original TIMIT phone set and the classical 39-phone set) and two speech representations (MFCC-based and PLP-based) are tested. A phone recognizer for each combination of phone set and speech representation is evaluated and analyzed on both the TIMIT training and test sets. The results show that the recognizer using the classical 39-phone set with the PLP representation yields the highest phone accuracy. However, this best recognizer works well only on a particular subset of phones, while the other recognizers perform better on other subsets. This suggests that a combination of several phone recognizers can achieve higher phone accuracy than any single phone recognizer.


Related resources

High performance speaker-independent phone recognition using CDHMM

In this paper we report high phone accuracies on three corpora: WSJ0, BREF and TIMIT. The main characteristics of the phone recognizer are: high dimensional feature vector (48), context- and gender-dependent phone models with duration distribution, continuous density HMM with Gaussian mixtures, and n-gram probabilities for the phonotactic constraints. These models are trained on speech data that hav...


Applications of virtual-evidence based speech recognizer training

We present two applications of our previously proposed virtual-evidence (VE) based speech recognizer training algorithm [1, 2]. The first relates to two-pass training where segmentations obtained during the first pass are used as VE to train the subsequent pass. We use the TIMIT phone and SVitchboard continuous speech recognition tasks to demonstrate the benefits of using VE based training in tw...


Speech Recognition Using a Discriminative, Context-Independent, Segment-Based Speech Recognizer

In this paper, we describe important improvements that were recently introduced in our Discriminative Stochastic Segment Model (DSSM) speech recognizer. We propose a new presegmentation algorithm and we optimize the structure of the Multi-Layer Perceptron (MLP) that estimates the phone probabilities. Additionally, we describe a cascade MLP combination technique that relaxes the drawbacks of ...


Directed graphical models of classifier combination: application to phone recognition

Classifier combination is a technique that often provides appreciable accuracy gains. In this paper, we argue that the underlying statistical model of classifier combination should be made explicit. Using directed graphical models (DGMs), we provide representations of two common combination schemes, the mean and product rules. We also introduce new DGMs that yield novel combination rules. We fi...


Convolutional Neural Network with Adaptive Windows for Speech Recognition

Although speech recognition systems are widely used and their accuracy is continuously improving, there is still a considerable gap between their performance and human recognition ability. This is partially due to high speaker variation in the speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...


Publication date: 2006
